MITRE at SemEval-2017 Task 1: Simple Semantic Similarity
Abstract
This paper describes MITRE’s participation in the Semantic Textual Similarity task (SemEval-2017 Task 1), which evaluated machine learning approaches to the identification of similar meaning among text snippets in English, Arabic, Spanish, and Turkish. We detail the techniques we explored, ranging from simple bag-of-ngrams classifiers to neural architectures with varied attention and alignment mechanisms. Linear regression is used to tie the systems together into an ensemble submitted for evaluation. The resulting system is capable of matching human similarity ratings of image captions with correlations of 0.73 to 0.83 in monolingual settings and 0.68 to 0.78 in cross-lingual conditions.
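As a rough illustration of the ensembling idea mentioned in the abstract, the sketch below fits a least-squares linear combination of several component similarity scores and evaluates it with Pearson correlation. It is not MITRE's implementation: the function names (`fit_ensemble`, `predict_ensemble`, `pearson_r`), the three synthetic "component systems", and the 0 to 5 rating range used to generate data are all assumptions made for the example.

```python
import numpy as np


def fit_ensemble(component_scores, gold):
    """Fit least-squares weights (plus an intercept) over component scores.

    component_scores: array of shape (n_pairs, n_systems)
    gold: array of shape (n_pairs,) with human similarity ratings
    """
    X = np.column_stack([component_scores, np.ones(len(gold))])
    weights, *_ = np.linalg.lstsq(X, gold, rcond=None)
    return weights


def predict_ensemble(component_scores, weights):
    """Apply the fitted weights to new component scores."""
    X = np.column_stack([component_scores, np.ones(component_scores.shape[0])])
    return X @ weights


def pearson_r(a, b):
    """Pearson correlation between two 1-D arrays."""
    return float(np.corrcoef(a, b)[0, 1])


# Synthetic data (assumption): three noisy "systems" that each approximate
# the gold rating with a different amount of noise.
rng = np.random.default_rng(0)
gold = rng.uniform(0.0, 5.0, size=200)
component_scores = np.column_stack(
    [gold + rng.normal(0.0, noise, size=200) for noise in (0.8, 1.0, 1.5)]
)

# Fit on the first 150 pairs, evaluate on the held-out 50.
train, test = slice(0, 150), slice(150, 200)
weights = fit_ensemble(component_scores[train], gold[train])
pred = predict_ensemble(component_scores[test], weights)

print("ensemble weights:", weights)
print("held-out Pearson r:", pearson_r(pred, gold[test]))
```

Holding out part of the data before measuring correlation keeps the reported score from reflecting only how well the weights fit the pairs they were trained on.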
Similar resources
MITRE: Seven Systems for Semantic Similarity in Tweets
This paper describes MITRE’s participation in the Paraphrase and Semantic Similarity in Twitter task (SemEval-2015 Task 1). This effort placed first in Semantic Similarity and second in Paraphrase Identification with scores of Pearson’s r of 61.9%, F1 of 66.7%, and maxF1 of 72.4%. We detail the approaches we explored including mixtures of string matching metrics, alignments using tweet-specific...
Lump at SemEval-2017 Task 1: Towards an Interlingua Semantic Similarity
This is the Lump team participation at SemEval 2017 Task 1 on Semantic Textual Similarity. Our supervised model relies on features which are multilingual or interlingual in nature. We include lexical similarities, cross-language explicit semantic analysis, internal representations of multilingual neural networks and interlingual word embeddings. Our representations allow to use large datasets i...
Jmp8 at SemEval-2017 Task 2: A simple and general distributional approach to estimate word similarity
We have built a simple corpus-based system to estimate words similarity in multiple languages with a count-based approach. After training on Wikipedia corpora, our system was evaluated on the multilingual subtask of SemEval-2017 Task 2 and achieved a good level of performance, despite its great simplicity. Our results tend to demonstrate the power of the distributional approach in semantic simi...
Neobility at SemEval-2017 Task 1: An Attention-based Sentence Similarity Model
This paper describes a neural-network model which performed competitively (top 6) at the SemEval 2017 cross-lingual Semantic Textual Similarity (STS) task. Our system employs an attention-based recurrent neural network model that optimizes the sentence similarity. In this paper, we describe our participation in the multilingual STS task which measures similarity across English, Spanish, and Ara...
UMDeep at SemEval-2017 Task 1: End-to-End Shared Weight LSTM Model for Semantic Textual Similarity
We describe a modified shared-LSTM network for the Semantic Textual Similarity (STS) task at SemEval-2017. The network builds on previously explored Siamese network architectures. We treat max sentence length as an additional hyperparameter to be tuned (beyond learning rate, regularization, and dropout). Our results demonstrate that hand-tuning max sentence training length significantly improve...